How to use GeoPressureR: an example
basic_example.RmdUsing a basic example, we walk through the main steps of the methodology to demonstrate how it can be used.
Preparing the data
To start, install the GeoPressureR package from Github using the following line:
devtools::install_github("Rafnuss/GeoPressureR")We will be using the following libraries:
library(GeoPressureR)
library(raster)
library(leaflet)
library(ggplot2)
library(plotly)
library(RColorBrewer)Reading geolocator data
In this example, we use data captured on a Great Reed Warbler Acrocephalus arundinaceus (18LX). Below, we read the geolocator data and crop to equipment and retrieval date.
pam_data = pam_read(pathname = system.file("extdata", "/", package = "GeoPressureR"),
crop_start = "2017-06-20", crop_end = "2018-05-02")Automatic classification of activity
We use a simple k-mean clustering to define periods of relatively high activity, and then classify high activities lasting more than 30 minutes as migratory activities. See more possible classifications described in the PALMr manual.
pam_data = pam_classify(pam_data, min_duration = 30)Editing on TRAINSET
To ensure the high level of precision needed for the pressure match, we must manually edit the activity classification and the pressure timeseries to be matched. We suggest doing this with TRAINSET. A separate vignette dedicated to this exercise, including best practices and a sample code to get started, is available here.
Use trainset_write() to export the automatically generated classifications in a csv file, which can be opened in TRAINSET: https://trainset.geocene.com/.
trainset_write(pam_data, pathname=system.file("extdata", "/", package = "GeoPressureR"))
# browseURL("https://trainset.geocene.com/")
Printscreen of the manual classification in TRAINSET. See the labeling track vignette for more information.
When you have finished the manual editing, export the new csv file (TRAINSET will add -labeled in the name). Make sure to keep this classification file (e.g. under /data/).
To re-edit an existing labeled file, you can simply re-open the file on TRAINSET and read this file directly with trainset_read().
pam_data = trainset_read(pam_data, pathname=system.file("extdata", "/", package = "GeoPressureR"))Identifying stationary periods
Based on the activity labeling, pam_sta() creates a table of stationary periods as illustrated below.
| start | end | duration | next_flight_duration | sta_id |
|---|---|---|---|---|
| 2017-06-20 00:00:00 | 2017-08-04 19:50:00 | 1099.83333 hours | 205 mins | 1 |
| 2017-08-04 23:15:00 | 2017-08-05 19:30:00 | 20.25000 hours | 440 mins | 2 |
| 2017-08-06 02:50:00 | 2017-08-06 19:15:00 | 16.41667 hours | 475 mins | 3 |
| 2017-08-07 03:10:00 | 2017-08-07 19:15:00 | 16.08333 hours | 295 mins | 4 |
| 2017-08-08 00:10:00 | 2017-08-29 18:40:00 | 522.50000 hours | 590 mins | 5 |
| 2017-08-30 04:30:00 | 2017-08-30 18:45:00 | 14.25000 hours | 565 mins | 6 |
We can visualize the pressure measurements for each grouped stationary period (symbolized by the same color).
p <- subset(pam_data$pressure, sta_id != 0) %>%
ggplot() +
geom_point(data=pam_data$pressure, aes(x=date,y=obs), colour="black",size=0.25) +
geom_line(aes(x=date,y=obs,col=as.factor(sta_id))) +
theme_bw() +
scale_y_continuous(name="Pressure(hPa)") +
scale_colour_manual(values=rep(RColorBrewer::brewer.pal(9,"Set1"),times=4))
#scale_colour_brewer(type="qualitative", palette = 'Set1')
ggplotly(p, dynamicTicks = T)Compute map of pressure
Now that we can see the clean smooth pressure timeseries for each stationary period, we are finally ready to match each one with atmospheric pressure data (ERA5). This R package is relying on the API GeoPressure to overcome the challenge of computing mismatch on such a large dataset by performing the compution on Google Earth Engine. Read more about the API here
Initially, it is easier and faster to query only long stationary periods. You can do that by setting the stationay period of pressure to NA.
sta_id_keep = pam_data$sta$sta_id[difftime(pam_data$sta$end,pam_data$sta$start, units = "hours")>12]
pam_data$pressure$sta_id[!(pam_data$pressure$sta_id %in% sta_id_keep)] = NAWe can now query the data on the API with geopressure_map(). A detailed description of the parameters can be found here
extent = c(-16,20,0,50) # coordinates of the map to request (W,E,S,N)
scale = 10 # request on a 0.1° grid to make the code faster
max_sample = 250 # limit the query to the first 250 datapoints.
margin = 30 # roughly equivalent to 3hPa
raster_list = geopressure_map(pam_data$pressure, extent = extent, scale = scale, max_sample = max_sample, margin = margin)geopressure_map() returns a list of two rasters for each stationary periods. The first one is the mean square error (\(\textbf{MSE}\)) between the pressure timeseries and ERA5 map. The second one (\(\textbf{z}_{thr}\)) is the proportion of datapoint of the pressure timeserie which correspond to altitude within the min and max altitude found in each grid cell. Read more about these values abd how they are computed here.
We then combine the two raster into a probability map with, \[\textbf{P} = \exp \left(-w \frac{\textbf{MSE}}{s} \right) [\textbf{z}_{thr}>thr]\] where \(s\) is the standard deviation of pressure and \(thr\) is the threasholv . Because the auto-correlation of the timeseries is not accounted for in this equation, we use a log-linear pooling weight \(w=\log(n) - 1\), with \(n\) is the number of datapoint in the timeserie. This operation is describe in publication […]. I will write another vignette to describe the influence of the log-linear pooling and length of the time series later.
s = 0.4 # standard deviation of pressure
thr = 0.9 # threashold of the threahold proportion value acceptable
prob_map_list = geopressure_prob_map(raster_list, s=s, thr=thr)We can use leaflet() to visualize the maps of mismatch, threshold and probability for a single stationary period.
i_r = 1;
leaflet() %>% addTiles() %>%
addRasterImage(prob_map_list[[i_r]], opacity = 0.8, group="Probability") %>%
addRasterImage(raster_list[[i_r]][[1]], opacity = 0.8, group="Mismatch") %>%
addRasterImage(raster_list[[i_r]][[2]], opacity = 0.8, group="Threashold") %>%
# addLegend(pal = pal, values = values(v[[i_s]][[3]]), title = "Probability") %>%
addLayersControl(
overlayGroups = c("Probability","Mismatch","Threashold"),
options = layersControlOptions(collapsed = FALSE)
) %>% hideGroup(c("Mismatch","Threashold"))We can also visualize for all stationary period (only the probability)
li_s=list()
l = leaflet() %>% addTiles()
for (i_r in 1:length(prob_map_list)){
i_s = metadata(prob_map_list[[i_r]])$sta_id
info = pam_data$sta[pam_data$sta$sta_id==i_s,]
info_str = paste0(i_s," | ",info$start,"->",info$end)
li_s <- append(li_s, info_str)
l = l %>% addRasterImage(prob_map_list[[i_r]], opacity = 0.8, group=info_str)
}
l %>%
addLayersControl(
overlayGroups = li_s,
options = layersControlOptions(collapsed = FALSE)
) %>% hideGroup(tail(li_s,length(li_s)-1))Compute altitude
The second operation that you can perform with GeoPressure is to compute the exact altitude of the bird \(z_{gl}\) from its pressure measurement \(P_{gl}\) and assuming its location \(x\). This function uses ERA5 to adjust the barometric equation, \[ z_{gl}(x)=z_{ERA5}(x) + \frac{T_{ERA5}(x)}{L_b} \left( \frac{P_{gl}}{P_{ERA5}(x)} \right) ^{\frac{RL_b}{g M}-1},\] Where \(z_{ERA}\), \(T_{ERA}\) and \(T_{ERA}\) are the ground level elevation, temperature at 2m and ground level pressure of ERA5, \(L_b\) is the standard temperature lapse rate, \(R\) is the universal gas constant, \(g\) is gravity constant and \(M\) is the molar mass of air. See more information here
We can compute the elevation of the bird for the first stationary period using the most likely position from the probability map.
i_r = 1
i_s = metadata(prob_map_list[[i_r]])$sta_id
tmp = as.data.frame(prob_map_list[[i_r]],xy=T)
lon = tmp$x[which.max(tmp[,3])]
lat = tmp$y[which.max(tmp[,3])]We can extract the pressure for this stationary period
id = pam_data$pressure$sta_id==i_s & !is.na(pam_data$pressure$sta_id)
pressure = list(
obs = pam_data$pressure$obs[id],
date = pam_data$pressure$date[id],
class = pam_data$pressure$class[id]
)And finally call the function geopressure_ts
ts_list[[i_r]] = geopressure_ts(lon, lat, pressure = pressure)We can compare the altitude produce to the one computed without the correction for temperature and pressure.
Lb = -0.0065
R = 8.31432
g0 = 9.80665
M = 0.0289644
T0 = 273.15+15
P0 = 1013.25
ts_list[[i_r]]$altitude_baro = T0/Lb * ((ts_list[[i_r]]$pressure/P0)^(-R*Lb/g0/M) - 1 )Visualize
p <- ggplot() +
geom_line(data=as.data.frame(ts_list[[i_r]]), aes(x=date,y=altitude, col=as.factor("Corrected elevation with ERA5"))) +
geom_line(data=as.data.frame(ts_list[[i_r]]), aes(x=date,y=altitude_baro, col=as.factor("Uncorrected elevation"))) +
labs(col="") +
theme_bw()
ggplotly(p)The function geopressure_ts() also returns the ground level pressure timeseries from ERA5 at the location specified. This is useful to check the good match between the pressure measured by the geolocator and the one at the assumed location. This operation is typically used to check the good quality of the manual labeling (see the labeling track vignette).